Solving Goal Hybrid Markov Decision Processes Using Numeric Classical Planners
Abstract
We present the domain-independent HRFF algorithm, which solves goal-oriented HMDPs by incrementally aggregating plans generated by the Metric-FF planner into a policy defined over discrete and continuous state variables. HRFF takes into account non-monotonic state variables and complex combinations of many discrete and continuous probability distributions. We introduce new data structures and algorithmic paradigms to deal with continuous state spaces: hybrid hierarchical hash tables, domain determinization based on dynamic domain sampling or on static computation of probability distributions’ modes, and optimization settings under Metric-FF based on plan probability and length. We compare with HAO∗ on the Rover domain and show that HRFF outperforms HAO∗ by many orders of magnitude in terms of computation time and memory usage. We also experiment with challenging, combinatorial HMDP versions of benchmarks from numeric classical planning, with continuous dead-ends and non-monotonic continuous state variables.
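The two determinization strategies named in the abstract can be illustrated with a minimal sketch. This is not HRFF's implementation: the `Outcome` class, the function names, and the restriction to a finite set of weighted outcomes are all assumptions made for illustration. Mode-based determinization keeps the most likely outcome of a probabilistic effect, while sampling-based determinization draws one outcome according to its probability, so that a deterministic planner can be run on the result.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Outcome:
    """Hypothetical probabilistic effect outcome (not taken from the paper)."""
    probability: float
    delta: float  # change applied to one continuous state variable


def determinize_by_mode(outcomes: List[Outcome]) -> Outcome:
    """Static determinization: keep the most probable outcome (the mode)."""
    return max(outcomes, key=lambda o: o.probability)


def determinize_by_sampling(outcomes: List[Outcome], rng: random.Random) -> Outcome:
    """Dynamic determinization: draw one outcome according to its probability."""
    weights = [o.probability for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Illustrative numbers only: energy consumed by a rover move.
    move_effects = [Outcome(0.2, -5.0), Outcome(0.7, -1.0), Outcome(0.1, 0.0)]
    print(determinize_by_mode(move_effects))
    print(determinize_by_sampling(move_effects, random.Random(0)))
```

Under these assumptions, the mode-based variant always yields the same deterministic problem, whereas the sampling variant can yield a different one on each call.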
Similar resources
Approximate Policy Iteration with a Policy Language Bias (draft)
We explore approximate policy iteration (API), replacing the usual cost-function learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and...
Approximate Policy Iteration with a Policy Language Bias
We explore approximate policy iteration (API), replacing the usual cost-function learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and...
A Hierarchical Framework for Composing Nested Web Processes
Many of the previous methods for composing Web processes utilize either classical planning techniques such as hierarchical task networks (HTNs), or decision-theoretic planners such as Markov decision processes (MDPs). While offering a way to automatically compose a desired Web process, these techniques do not scale to large processes. In addition, classical planners assume away the uncertaintie...
Multi-Threaded BLAO* Algorithm
We present a heuristic search algorithm for solving goal-based Markov decision processes (MDPs) named Multi-threaded BLAO* (MBLAO*). Hansen and Zilberstein proposed a heuristic search MDP solver named LAO* (Hansen & Zilberstein 2001). Bhuma and Goldsmith extended LAO* to the bidirectional case (Bhuma & Goldsmith 2003) and named their solver BLAO*. Recent experiments on BLAO* (Dai & Goldsmith 20...
Contingent Planning Under Uncertainty via Stochastic Satisfiability
We describe a new planning technique that efficiently solves probabilistic propositional contingent planning problems by converting them into instances of stochastic satisfiability (SSat) and solving these problems instead. We make fundamental contributions in two areas: the solution of SSat problems and the solution of stochastic planning problems. This is the first work extending the planning...